ABSTRACT

When serially reusable multiunit resources are shared among many
processes, each of which has exclusive control over some resource
units, deadlocks can occur. The work of Holt [Holt, 71b] stated the
problem of deadlock detection as a directed multigraph problem. In
this paper we examine the possibility of parallel algorithms for
deadlock detection. Although many graph problems have efficient
parallel solutions (in parallel polylogarithmic time, using only a
polynomial number of processors), we present strong evidence that
this is not the case for the general deadlock detection problem. We
show that the problem is complete in P under log-space reductions and
thus probably not efficiently parallelizable. Fortunately, when the
problem is restricted (e.g. single-unit requests of processes or
single-unit resources), it falls in NC. We present efficient parallel
algorithms for the restricted versions of the deadlock detection
problem.


1. Introduction

When serially reusable resources are shared among a population of
processes, each of which maintains exclusive control over particular
resources allocated to that process, it is possible for a deadlock to
develop in which some processes will never be able to finish. If a
system does not employ some protocol that ensures that no deadlock
will ever occur, then a detection and recovery scheme must be
implemented. In such cases an algorithm that examines the state of
the system is invoked periodically to determine whether a deadlock
has occurred. If so, the system must attempt to recover from the
deadlock.

The work of [Holt, 71, 72] showed that the problem of deadlock
detection can be stated as a directed graph problem over a bipartite
digraph representing the current state of requests and allocations of
resource units to processes (the resource graph, [Holt, 71]). The
well-known banker's algorithm (see [Dijkstra, 6?], [Habermann, 69])
is perhaps the oldest and best known of the sequential algorithms for
deadlock detection. It can be made to run in O(mn log n) time, where
n is the number of processes and m the number of resources (or
resource types, [Peterson, Silberschatz, 84]) in the system.

Deadlock detection becomes more challenging in the multiprocessor
systems of today, not only because the size of the problem is
considerably increased, but also because one may use the parallelism
of the system to construct more efficient parallel detection
algorithms. It has been argued by many researchers that ultimate
multiprocessor performance will require parallel computation, even at
the level of operating system calls (see e.g. [Schwartz, 80]). This
fact, together with the richness of results on parallel graph
algorithms, led us to examine the possibility of the existence of
efficient parallel algorithms for deadlock detection. The theoretical
model in mind is the P-RAM of [Wyllie, 7?] with the resource graph
stored in shared memory.

Issues of sequential complexity of deadlock avoidance problems were
examined by [Devillers, 77], [Gold, 78] and [Minoura, 8?]. A formal
treatment of deadlock can be found in [Holt, 72], also in [Shaw, 74]
and [Coffman, Denning, 73]. Distributed deadlock detection algorithms
were given by [Rosenkrantz et al., 78], [Menasce, Muntz, 79] and also
[Gligor, Shattuck, 80] and [Obermark, 82]. The work of [Kameda, 80]
examined the problem of testing the deadlock freedom of computer
systems. However, there is no previous work on the parallel
complexity of deadlock detection.

In this paper, we first prove that the general deadlock detection
problem (over many serially reusable resources with more than one
unit each) is complete in P (i.e. the class of problems which can be
solved in polynomial time) under log-space reductions. Hence we
provide strong evidence that the variations of the banker's algorithm
cannot be significantly sped up by the use of any reasonable number
of parallel processors. (Theoretically, some speed-up is always
possible, as in [Dymond, 83] and [Reif, 82], but it requires an
exponential number of processors in general.) (Other problems
complete in P under log-space or NC reductions can be found in
[Dobkin et al., 79], [Goldschlager et al., 82], [Ladner, 75],
[Jones, Laaser, 77].)

Although deadlock detection may not be efficiently parallelized in
the general case, special cases which involve restrictions on the
allocators, the number of resource units requested simultaneously and
on the number of resource units per resource can be handled
efficiently by parallel algorithms. We prove that the deadlock
detection problem for the cases of immediate allocations, single-unit
resources and single-unit requests is in NC (the class of functions
computable in parallel polylogarithmic time with a polynomial number
of processors). Thus it seems that the designers of parallel
operating systems will have to make decisions about restricting the
generality of resource allocation schemes in order to gain in
efficiency.

2. The model.

We use the model of [Holt, 72] (see also [Shaw, 74]). The state of
the system is represented by the reusable resource multigraph
D = (V, E), where V is the union of the set Π of process nodes (each
node p_i ∈ Π being a process) and of the set ρ of resource nodes
(each node R_j ∈ ρ denotes a serially reusable resource). Let
|Π| = n and |ρ| = m. The directed multigraph D is bipartite with
respect to Π and ρ. Each arc e ∈ E is either of the type
e = (p_i, R_j) (in which case e is a request edge and is interpreted
as a request by p_i for 1 unit of R_j) or of the type e = (R_j, p_i)
(in which case e is an assignment edge and indicates an allocation of
1 unit of R_j to p_i).

For each resource R_j ∈ ρ, there is a nonnegative integer t_j
denoting the number of units of R_j. The system must always work
within the following limitations: (a) no more than t_j assignments
(allocations) may be made for R_j (for all j = 1, 2, ..., m) and
(b) the sum of the requests and allocations of any process for a
particular resource R_j cannot exceed the t_j units of that resource.

In the following, we will denote by |(a, b)| the number of edges
directed from node a to node b.
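
As an illustration of this notation, the multigraph can be held as a
table of edge multiplicities. The following Python sketch is not part
of Holt's model. The names `edge_count` and `respects_limits` are
hypothetical, and limitation (b) is read here as a bound by the total
number of units t_j, which is an assumption.

```python
from collections import Counter

def edge_count(edges, a, b):
    """Return |(a, b)|, the number of edges directed from node a to node b."""
    return edges[(a, b)]

def respects_limits(units, edges):
    """Check limitations (a) and (b) for a state.

    units: {resource: t_j}; edges: Counter mapping (tail, head) -> multiplicity.
    Any node that is not a key of `units` is treated as a process.
    """
    processes = {x for edge in edges for x in edge if x not in units}
    for r, t in units.items():
        # (a) at most t_j assignment edges leave R_j
        if sum(edges[(r, p)] for p in processes) > t:
            return False
        # (b) requests plus allocations of one process for R_j stay within t_j
        if any(edges[(p, r)] + edges[(r, p)] > t for p in processes):
            return False
    return True

# A state like the one in Fig. 1. That R1 has one unit and that p2 holds
# the two allocated units of R2 are assumptions made for this example.
units = {"R1": 1, "R2": 3}
edges = Counter({("R1", "p1"): 1, ("p1", "R2"): 2, ("R2", "p2"): 2})
```

Storing multiplicities rather than individual edges keeps |(a, b)| a
single lookup, which is all the limitations need.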

The multigraph D represents the current system state. D changes to a
new state only through requests, releases or acquisitions of
resources by one or more processes. These operations follow some
rules:

(1)  A process p_i may request at once any number of resources
(including any number of units of a particular resource) subject
to limitations (a) and (b), provided that p_i has no requests
outstanding.

(2)  If a process p_i has outstanding requests and all such requests
can be satisfied, then a possible change of state is an
acquisition, where all request edges of p_i reverse direction.

(3)  If a process p_i has no requests outstanding then p_i may
release any nonempty subset of resource units (the corresponding
edges are deleted).

In parallel systems, more than one operation (as above) may happen
concurrently, provided that they do not conflict with each other
(i.e. they follow (a) and (b)).

It is important to note that processes are nondeterministic;
subject to the above restrictions, any operation by any subset of
processes is possible at any time.
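
The three rules can be sketched as operations on a small state object.
This is only an illustration under stated assumptions: the class and
method names are hypothetical, a request is taken to be satisfiable
when enough units of every requested resource are free, and limitation
(b) is read as a bound by the total number of units.

```python
from collections import Counter

class SystemState:
    def __init__(self, units):
        self.units = dict(units)   # resource -> t_j
        self.requests = Counter()  # (p, R) -> number of request edges (p, R)
        self.held = Counter()      # (p, R) -> number of assignment edges (R, p)

    def free(self, r):
        """Units of r that are currently unallocated."""
        return self.units[r] - sum(k for (_, res), k in self.held.items() if res == r)

    def outstanding(self, p):
        """Pending requests of p as {resource: units}."""
        return {r: k for (q, r), k in self.requests.items() if q == p and k > 0}

    def request(self, p, wanted):
        """Rule (1): p asks for wanted = {resource: units} in one step."""
        if self.outstanding(p):
            raise ValueError("process already has outstanding requests")
        if any(k + self.held[(p, r)] > self.units[r] for r, k in wanted.items()):
            raise ValueError("request violates limitation (b)")
        self.requests.update({(p, r): k for r, k in wanted.items()})

    def acquire(self, p):
        """Rule (2): all request edges of p reverse direction, if satisfiable."""
        wanted = self.outstanding(p)
        if not wanted or any(self.free(r) < k for r, k in wanted.items()):
            return False
        for r, k in wanted.items():
            self.held[(p, r)] += k
            del self.requests[(p, r)]
        return True

    def release(self, p, giving):
        """Rule (3): p gives back a nonempty set of units it holds."""
        if self.outstanding(p) or not giving:
            raise ValueError("release not allowed")
        if any(k <= 0 or k > self.held[(p, r)] for r, k in giving.items()):
            raise ValueError("process does not hold these units")
        for r, k in giving.items():
            self.held[(p, r)] -= k
```

The sketch applies one operation at a time; concurrent operations as
allowed in parallel systems are not modelled.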

3. The general deadlock detection problem and its parallel
complexity.

The general deadlock detection problem is the following:
Given the multigraph D = (V, E) with V = Π ∪ ρ, Π = {p_1, ..., p_n},
ρ = {R_1, ..., R_m} and the set T = {t_1, ..., t_m} of units, is D a
deadlock state, i.e. is there in D any subset S of the set Π such
that the processes in S cannot change state and will never change
state in the future?

R. Holt, in [Holt, 71b], examined the above problem in depth. We now
quickly summarize the key notions and theorems proved in [Holt, 71b].
They will be used in this paper.

Definition 3.1. [Holt, 71b]

A process p_i is blocked in a state D when p_i has outstanding
requests which cannot be satisfied in D.

Definition 3.2. [Holt, 71b]

Multigraph D is reduced by a process p_i (which is neither blocked
nor an isolated node) by removing all edges to and from p_i. (This is
interpreted as p_i acquiring any resources for which it has pending
requests and then releasing all of its resources. Then p_i becomes an
isolated node.)

Definition 3.3. [Holt, 71b]

The multigraph D is called irreducible if it cannot be reduced by any
process.

Definition 3.4. [Holt, 71b]

D is called completely reducible iff there exists a sequence of
reductions that deletes all edges of the graph.

Theorem 3.1. [Holt, 71b]

All reduction sequences of a given reusable resource multigraph D
lead to the same irreducible multigraph.

Theorem 3.2. [Holt, 71b] (Deadlock Theorem)

D is a deadlock state iff D is not completely reducible.

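We note that the above results of Holt give a quick sequential
algorithm for testing if D is completely reducible (and hence whether
a deadlock exists): by Theorem 3.1 the reductions may be applied in
any order, and by Theorem 3.2 a deadlock exists iff some edge remains
when no further reduction is possible.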
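
A minimal sequential sketch of this test is given below. It is a naive
illustration rather than Holt's optimized algorithm. The function name
and the input encoding are hypothetical, and a process is assumed to
be blocked exactly when it requests more units of some resource than
are currently free.

```python
from collections import Counter

def is_deadlock_state(units, requests, assignments):
    """Return True iff the state is not completely reducible.

    units: {R: t}; requests: {(p, R): count}; assignments: {(R, p): count}.
    """
    available = Counter(units)
    pending, held = {}, {}
    for (p, r), k in requests.items():
        pending.setdefault(p, Counter())[r] += k
    for (r, p), k in assignments.items():
        held.setdefault(p, Counter())[r] += k
        available[r] -= k
    remaining = set(pending) | set(held)  # process nodes that still have edges
    progress = True
    while progress:
        progress = False
        for p in sorted(remaining):
            if all(available[r] >= k for r, k in pending.get(p, {}).items()):
                # p is not blocked: reduce by p, returning every unit it holds
                available.update(held.get(p, {}))
                remaining.discard(p)
                progress = True
    return bool(remaining)
```

Each pass reduces by every process that is currently unblocked, so the
loop stops after at most n passes.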

Definition 3.3. (See e.g. [Reif, 82])

Let L and L' be languages over an alphabet Σ. We say L' is log-space
reducible to L if there exists a function f such that:

(a)  for each ω ∈ Σ*, ω ∈ L' iff f(ω) ∈ L

(b)  f is computable in log space by a deterministic Turing Machine.

Definition 3.4. (See e.g. [Reif, 82])

Let P be the class of languages accepted in deterministic polynomial
time by Turing Machines. L is complete in P under log-space
reductions if:

(a)  L ∈ P and

(b)  for each L' ∈ P, L' is log-space reducible to L.

Note: It is known [Hopcroft, Ullman, 79] that log-space reducibility
is transitive. We now state the first main result of this paper:

Theorem 3.3.

The general deadlock detection problem is complete in P under
log-space reductions.
Proof.

We will use a reduction from the following version of the circuit
value problem (see [Ladner, 75]):

Define a boolean circuit to be a sequence B = (B_1, ..., B_n) where
B_i is either true or false or an expression op(B_{i1}, B_{i2}) where
i1, i2 < i and "op" is AND or OR.

Let value(B_i) = B_i if B_i is a constant, and
value(B_i) = op(value(B_{i1}), value(B_{i2})) otherwise.
Let value(B) = value(B_n).

The circuit value problem with AND and OR is: Given a boolean circuit
B (as above), test if value(B) = true. [Ladner, 75] has shown that:

Lemma 3.1. [Ladner, 75]

The circuit value problem with AND and OR is complete in P under
log-space reductions.
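
For concreteness, the value of such a circuit can be computed gate by
gate in index order, as in the sketch below. The tuple encoding of
gates and the function name are assumptions made for the example.

```python
def circuit_value(circuit):
    """Evaluate B = (B_1, ..., B_n); circuit[i-1] describes B_i.

    Each entry is True, False, or a tuple (op, i1, i2) with op in
    {"AND", "OR"} and 1 <= i1, i2 < i.
    """
    value = {}
    for i, gate in enumerate(circuit, start=1):
        if isinstance(gate, bool):
            value[i] = gate
            continue
        op, i1, i2 = gate
        if op not in ("AND", "OR") or not (1 <= i1 < i and 1 <= i2 < i):
            raise ValueError(f"malformed gate B_{i}: {gate!r}")
        if op == "AND":
            value[i] = value[i1] and value[i2]
        else:
            value[i] = value[i1] or value[i2]
    return value[len(circuit)]
```

This evaluation is inherently sequential in the gate index: each gate
waits for the two earlier gates it refers to.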

Let us use boxes to represent resources and circles to represent
processes. Little circles inside resource boxes represent units
(Fig. 1).

Fig. 1. Process p_1 holds one unit of resource R_1 and asks for 2
units of R_2. R_2 has 3 units, two of them are allocated.


For the purposes of the reduction, we will use one process P_i for
each variable B_i, i = 1, ..., n. An additional special process P_0
will be used. Given an instance of the circuit value problem, we work
as follows:

(1)   The case B_i := false is represented by having the process P_i
ask for the single unit of resource RF_i, which is however
allocated to P_0 (Fig. 2).

Fig. 2: B_i := false


(2)   The case B_i := true is represented by having the process P_i
ask for nothing but be granted a resource unit from the resource
RT_i (which has a single unit). We also set P_0 to ask for that
unit (Fig. 3).

Fig. 3: B_i := true


(3)   The case B_i := B_j AND B_k (j, k < i) is denoted by having
process P_i ask for two single-unit resources, each being held by
one of P_j, P_k (Fig. 4).

Fig. 4: B_i := B_j AND B_k

Note that P_i can be reduced only if both P_j and P_k can.

(4)   The case B_i := B_j OR B_k (j, k < i) is denoted by having
process P_i ask for one unit of a two-unit resource R_ijk. One
unit of R_ijk is given to P_j and one to P_k (Fig. 5).

Fig. 5: B_i := B_j OR B_k

Note that P_i can be reduced only if at least one of P_j, P_k can.


(5)   We also show (in Fig. 6) how to repeat the same AND or OR
statement.

Fig. 6.1: B_i1 := B_j AND B_k, B_i2 := B_j AND B_k

Fig. 6.2: B_i1 := B_j OR B_k, B_i2 := B_j OR B_k

(6)   Finally we let process P_0 ask for all units of all resources
of the type R . or type R . ., .

It should be clear that the transformation of B to the resource
multigraph D_B created by rules (1), ..., (6) can be done in
O(log n) space only.

Claim 3.1. P_0 can be reduced iff value(B_n) = true.

Proof of Claim (Sketch)

P_0 is blocked only by the resources held by P_n (which corresponds
to B_n). From our construction, a process P_i will be reduced either
if B_i is set to true initially or if value(B_i) = true (a small
induction on i is needed here). Hence value(B_n) = true is equivalent
to the statement "P_n will be reduced", which implies that the
special process P_0 can be reduced iff value(B_n) = true.

Claim 3.2. P_0 can be reduced iff D_B is completely reducible.

Proof (Sketch)

If P_0 cannot be reduced then clearly D_B is not completely
reducible. If P_0 can be reduced, then all initially "false"
processes can be reduced (they are blocked only by P_0). Since a
process P_i is blocked only by previously "defined" processes, every
P_i can then be reduced.

To complete the proof of Theorem 3.3, we combine the two claims to
prove that value(B) = true iff D_B is completely reducible.
Membership in P follows from the sequential algorithm given by Holt's
results.


Remark: We just showed that it is unlikely for the general deadlock
detection problem to be solved efficiently in parallel (e.g. in time
polylogarithmic in n). This is a serious constraint for the design of
parallel operating systems. However, our next section shows that more
restricted cases of deadlock detection are efficiently
parallelizable.

4. The Restricted Deadlock Detection Problem.

4.1. The Case of Single-Unit Resources.

Assume that each resource has one unit, i.e. t_j = 1, j = 1, ..., m.
[Holt, 71b] showed that:

Theorem 4.1. [Holt, 71b]

A reusable resource (multi)graph with single-unit resources is a
deadlock state iff it contains a cycle.

Cycles are easily detected in parallel. One can use breadth-first
search, as in [JaJa, 78], to compute the strong components of the
multigraph D in parallel (by using a CREW PRAM) in O(log n · log d)
time, where d is the depth of the digraph (max of shortest directed
simple paths), by the use of O(n^3 / log n) processors.

The existence of a cycle is equivalent to the existence of a
nontrivial strong component. Hence,

Lemma 4.1.

Deadlock detection for single-unit resources is in NC.
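
To give a feel for why cycle detection parallelizes well, the sketch
below uses a different and simpler technique than the breadth-first
search of [JaJa, 78]: transitive closure by repeated squaring of the
boolean adjacency matrix. About log2 of the number of nodes squarings
suffice, and within one squaring every matrix entry can be computed
independently. The code runs sequentially and only mirrors that
structure. The function name and the integer node numbering are
assumptions.

```python
def has_cycle(size, edges):
    """size: number of nodes (numbered 0..size-1); edges: (tail, head) pairs."""
    reach = [[False] * size for _ in range(size)]
    for u, v in edges:
        reach[u][v] = True
    covered = 1  # reach[u][v] is True iff a walk of 1..covered edges exists
    while covered < size:
        # One squaring step: every entry depends only on the previous matrix,
        # so all entries could be computed at the same time.
        reach = [
            [reach[u][v] or any(reach[u][w] and reach[w][v] for w in range(size))
             for v in range(size)]
            for u in range(size)
        ]
        covered *= 2
    # A cycle exists iff some node reaches itself by a nonempty walk.
    return any(reach[v][v] for v in range(size))
```

In the examples used to check this sketch, nodes 0 and 1 stand for
processes and nodes 2 and 3 for single-unit resources.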

4.2. Immediate allocations and single-unit requests.

If the system states are such that all satisfiable requests have been
granted (e.g. when the resource allocators grant satisfiable requests
immediately), we call them expedient states (see [Holt, 71b]). Let us
assume that the reusable resource multigraph D is expedient and let
us furthermore assume that processes may request only one unit at a
time (i.e. at most one request edge can be connected to any process
node, or, equivalently, the out-degree of each process node is ≤ 1).

Definition 4.1.

A knot in a directed graph or directed multigraph is a subset S of
the set of nodes V such that

(1)  all vertices of S are reachable from each vertex of S and

(2)  no vertex of S can reach a vertex outside S (there are no edges
going out of S).

Remark 4.1.

A knot of a digraph is a strongly connected component with no edges
going out of the strongly connected component.

Theorem 4.1. [Holt, 71b]

A reusable resource multigraph D = (V, E) which is expedient and has
only single-unit requests is a deadlock state iff it contains a
nontrivial knot.

In the work of [JaJa, 78] and [Savage, JaJa, 81], it is shown how to
get O(log^2 n) parallel algorithms for the construction of strongly
connected components of digraphs. These algorithms run on CRCW P-RAM
machines and produce a data structure (in shared memory) which allows
any two nodes of the digraph to test whether they belong to the same
strong component in constant time. The number of processors required
is O(n^3 / log n).

Remark 4.2.

Let D = (V, E) be a reusable resource multigraph which is expedient
and has been constructed through single-unit requests. Let D' be the
directed graph constructed from D by replacing each nonempty set of
directed edges which join two nodes and have the same direction by a
single directed edge. Then D has a knot iff D' has a knot.
Proof. Easy: replacing parallel edges by a single edge does not
change which nodes are reachable from which, and knots are defined by
reachability alone.

Lemma 4.2.

Let D be a reusable resource multigraph which is expedient and is a
state of a system with single-unit requests. Then we can detect
deadlocks in D by using a CRCW P-RAM with O(n^3 / log n) processors
in O(log^2 n) parallel time.
Proof.

The PRAM constructs D' (as in Remark 4.2) out of D by assigning one
processor per edge of D and doing a logical-OR operation for the set
of edges of the same direction for each vertex pair. (Logical OR can
be done in O(1) parallel time on CRCW PRAMs, see [JaJa, 78].)

Then the PRAM runs the strong components parallel algorithm of
[Savage, JaJa, 81].

At the end, each edge processor tests in parallel (in O(1) time)
whether the two vertices (which are connected by the edge) belong to
different strongly connected components. If so, then the strong
component of the origin of the directed edge is marked. Marked strong
components cannot be knots. Then in O(1) parallel time (through a
logical OR operation) the PRAM checks in parallel whether one or more
knots have been found.
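
The three steps of the proof can be followed in a sequential sketch.
It is not a PRAM program: plain reachability sets stand in for the
strong components algorithm of [Savage, JaJa, 81], and the loop over
edges stands in for the edge processors. The function name is
hypothetical, and a knot is taken to be nontrivial when its strong
component has more than one node, which is an assumption.

```python
def has_nontrivial_knot(edges):
    """edges: iterable of (tail, head) pairs of D; repeated pairs are allowed."""
    simple = set(edges)  # step 1: D', parallel edges collapsed (Remark 4.2)
    nodes = {x for edge in simple for x in edge}
    succ = {v: set() for v in nodes}
    for u, v in simple:
        succ[u].add(v)

    def reachable(src):
        seen, stack = {src}, [src]
        while stack:
            for w in succ[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    # step 2: strong component of v = nodes mutually reachable with v
    reach = {v: reachable(v) for v in nodes}
    component = {v: frozenset(w for w in reach[v] if v in reach[w]) for v in nodes}
    # step 3: an edge leaving a component marks the component of its origin
    marked = {component[u] for u, v in simple if component[u] != component[v]}
    return any(len(c) > 1 and c not in marked for c in set(component.values()))
```

For example, two processes that each hold one single-unit resource and
request the other's form a knot, while adding an edge that leaves the
component marks it and rules the knot out.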